Abstract

The phenomenon of six degrees of separation is an old but attractive subject. A deep
understanding of it has not been reached yet; in particular, how closed paths in a network
affect six degrees of separation remains an important open question. Some research has been
done on it [1], [2]. Recently we have developed a formalism [20], [21] to explore the subject,
based on the string formalism developed by Aoyama [2]. The formalism can systematically
investigate the effect of closed paths, especially of the generalized clustering coefficient
$C_{(p)}$ introduced in [21], on six degrees of separation. In this article, we analyze general
q-th degrees of separation by using this formalism. We find that the scale free network with
exponent $\gamma = 3$ just displays six degrees of separation. Furthermore we derive a
phenomenological relation between the separation number $q$ and $C_{(p)}$, which carries crucial
information on circle structures in networks.

Keywords: Six Degrees of Separation, String, Clustering Coefficient, Adjacency Matrix,
Generalized Clustering Coefficient

1 Introduction 

In 1967, Milgram made a great impact on the world by advocating "six degrees of separation"
in a celebrated paper [3] based on a social experiment. "Six degrees of separation"
indicates that the world of human acquaintances is surprisingly small. A series of social
experiments made by him and his collaborators [4], [5] lent further support to the suggestion
that all people in the USA are connected through about 6 intermediate acquaintances.

Two breakthroughs in network theory at the end of the last century marked the start of
"complex network theory". One is the small world network proposed by Watts and Strogatz
[7], [8]. The other is the scale free network proposed by Barabasi et al. [9], [10]. Many
empirical networks exhibit the characteristic features of scale free networks [11], [13], [14],
[15]. These frameworks provided compelling evidence that the small-world phenomenon is pervasive
in a range of networks arising in nature and technology, and that it is a fundamental ingredient
in the evolution of the World Wide Web. Furthermore Watts and his coworkers continued to explore
six degrees of separation [16], [17]. We think, however, that the phenomenon of six degrees of
separation is not well understood from a theoretical point of view. In particular, how does the
clustering coefficient proposed in [7] affect it? If the network of human relations has a tree
structure without circles, the number of new persons that a person reaches grows as a power of
the average degree when he or she follows acquaintances step by step through the network. Then
six degrees of separation is not so amazing if a person has a few hundred acquaintances. The
problem is that networks of general human relations include circles. These structures decrease
the number of new persons reached at each step. One of the indices characterizing circle
structures is the clustering coefficient. Thus it is important to investigate the effect of the
clustering coefficient on six degrees of separation. It is, however, difficult to investigate
the influence of circle structures of general size. There are in fact only a few studies focused
on the effect of circle structures.

With these motives, we have studied the subject from a theoretical point of view. First we
investigated it by imposing a homogeneous hypothesis on networks [12]. As a result, we found
that under this hypothesis the clustering coefficient has no decisive effect on the propagation
of information on a network, and that information easily spreads to a lot of people even in
networks with a relatively large clustering coefficient; a person needs only dozens of friends
for six degrees of separation. Moreover we studied six degrees of separation in depth, based on
some models proposed by Pool and Kochen [6], numerically by using a computer [15]. In that
article, we estimated the clustering coefficient by the method developed by us [18] and improved
our analysis of the subject by combining Pool and Kochen's models with the method introduced in
[18]. As a result, it seems on the whole to be difficult for six degrees of separation to be
realized in the models proposed by Pool and Kochen [6].

These studies were, however, made only under rather restricted conditions on networks. Newman
studied the influence of circle structures in general networks on the subject [1]. The study is
stimulating, but only triangle and quadrilateral structures in networks were considered. It
seems to be difficult to generalize his framework to p-polygons, that is, circles of general
size p. Recently Aoyama proposed the string formulation for the subject [2]. The idea greatly
inspired our study in this article. Although the formalism is available for general networks
with any circles, he unfortunately tackled the subject only in the tree approximation of
networks. Since he deals mainly with scale free networks with a small clustering coefficient,
the approximation is valid up to a certain point. We developed the string formalism further by
fusing it with the adjacency matrix formulation, so that one can analyze six degrees of
separation even in networks with circles of general size [20], [21].

In [20], the formalism and its justification are mainly given, and the formalism and analyses
of two degrees of separation were reported as preliminary results in [21]. Although we also
defined the general p-th clustering coefficient $C_{(p)}$ in [21], we have not yet discussed any
relation between six degrees of separation and $C_{(p)}$. In this article we pursue the relations
between the separation number $q$ and $C_{(p)}$, as well as general q-th degrees of separation
(where $q \le 6$), in the string formulation. After that, we show that a phenomenological
relation holds. The result naturally reflects the effect of circle structures in networks on
separation.

The plan of this article is as follows. After the introduction, we briefly review the formalism
developed in [20], [21] in section 2. Following the formalism, we introduce generalized p-th
clustering coefficients as well as the usual global one. In section 3, we analyze q-th degrees
of separation (where $q \le 6$) in scale free networks [9], [10] with various values of the
exponent, based on the Milgram condition proposed by Aoyama [2]. Though the obtained result is a
little different from Aoyama's, it does not crucially contradict Aoyama's conjecture. The
justification for our result is provided by estimating the power $A^q$ of the adjacency matrix
$A$. We discuss the relation between the separation number $q$ and $C_{(p)}$ in section 4, and
show that a phenomenological relation holds there. The last section 5 is devoted to a summary.

2 Review of String Formulation and Adjacency Matrix

2.1 String Formalism

We review the formalism given in [20], [21], following the formulation introduced by Aoyama
[2].

We consider a string-like part of a graph with $j$ connected vertices and call it a "j-string".
$N$ is the number of vertices in the network under consideration and $S_j$ is the number of
j-strings in the network. (Note that $S_j$ in this article is $N$ times larger than the $S_j$
defined by Aoyama [2].) By definition, $S_1 = N$ and $S_2$ is the number of edges in the
network. $\bar{S}_j$ is the number of non-degenerate j-strings, where a non-degenerate string is
defined as a string without any multiple edges and/or any circles in its subgraphs, as seen in
Fig. 1. We, however, define that the non-degenerate strings include strings homeomorphic to a
circle.

We call strings without any circles as subgraphs and/or as whole graphs "open strings" and
strings that are homeomorphic to a circle as a whole "closed strings". Thus we consider closed
strings and open strings in this article.

It is generally very difficult to calculate $S_j$ and $\bar{S}_j$. At present it is perhaps
impossible to calculate $S_j$ and $\bar{S}_j$ with $j > 7$ explicitly.

2.2 Generalized Clustering Coefficient 

Figure 1: Two types of strings: (a) degenerate strings, (b) nondegenerate strings

By using the string formulation, we can define the usual clustering coefficient, which
essentially counts the number of triangular structures in a network. Although there are several
definitions of the clustering coefficient [7], [8], we adopt the usual global clustering
coefficient $C_{(3)}$ [1] defined by

$$C_{(3)} = \frac{6 \times \text{number of triangles}}{\text{number of connected triplets}} = \frac{6\Delta_3}{\bar{S}_3}, \qquad (1)$$

where $\Delta_q$ is generally the number of polygons with $q$ edges in a network. Some authors
have extended the clustering coefficient for triangles to one for quadrilaterals. We, however,
find it difficult to extend this further to circles of larger size. But we need to introduce
certain indices in order to uncover properties of general polygon structures in networks. From
the expression of Eq. (1), we can generalize it straightforwardly to the p-th clustering
coefficient $C_{(p)}$:

$$C_{(p)} = \frac{2p \times \text{number of polygons}}{\text{number of connected } p\text{-plets}} = \frac{2p\Delta_p}{\bar{S}_p}. \qquad (2)$$
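The factor $2p$ in the numerator of Eq. (2) reflects the fact that a direct enumeration meets each p-gon once per starting vertex and once per direction. The following is a minimal sketch in Python of such a brute-force count of $\Delta_p$ on a small graph; the function name `count_polygons` and the nested-list adjacency format are illustrative assumptions, not part of the formalism of [20], [21].

```python
def count_polygons(adj, p):
    """Number of simple cycles with p edges (p >= 3) in an undirected simple graph.

    adj is a symmetric 0/1 nested list with zero diagonal (assumed input format).
    """
    n = len(adj)
    rooted = 0  # closed strings counted once per start vertex and per direction

    def extend(start, current, visited, edges_left):
        nonlocal rooted
        if edges_left == 1:
            # The last edge has to lead back to the start vertex.
            if adj[current][start]:
                rooted += 1
            return
        for nxt in range(n):
            if adj[current][nxt] and nxt not in visited:
                extend(start, nxt, visited | {nxt}, edges_left - 1)

    for v in range(n):
        extend(v, v, {v}, p)
    # Each p-gon was found from p start vertices in 2 directions.
    return rooted // (2 * p)


# Complete graph on four vertices as a small example.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(count_polygons(K4, 3), count_polygons(K4, 4))
```

The enumeration is exponential in $p$, so it is only meant for checking small cases by hand.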



2.3 Adjacency Matrix Formulation

We reformulate $C_{(p)}$ introduced in Eq. (2) by utilizing the adjacency matrix $A = (a_{ij})$.
Generally the powers $A^2, A^3, A^4, \cdots$ of the adjacency matrix $A$ tell us how a vertex
connects to other vertices through $2, 3, 4, \cdots$ intermediate edges, respectively. The
information in $A^n$ on the connectivity between two vertices $i_0$ and $i_n$ generally also
contains the multiplicity of edges. To resolve this degeneracy, we introduce a new series of
matrices $R^n$, which tell us how a vertex connects to other vertices through $n$ intermediate
edges without multiplicity. We can find it by the following formula [20]:

$$[R^n]_{i_0 i_n} = \sum_{i_1, \ldots, i_{n-1}} a_{i_0 i_1} a_{i_1 i_2} \cdots a_{i_{n-1} i_n} \frac{\prod_{k<l} (1 - \delta_{i_k i_l})}{1 - \delta_{i_0 i_n}}, \qquad (3)$$

where the product of $(1 - \delta_{i_k i_l})$ in the numerator protects strings from
degeneracies, while $(1 - \delta_{i_0 i_n})$ in the denominator is needed to keep closed strings.

This expression leads to $(n-1)$-fold nested loops in a computer program, and so it is almost
impossible to calculate $R^n$ in a realistic time for large $N$. The expansion of Eq. (3)
formally has $2^{n(n-1)/2}$ terms. This value is 32768 for $n = 6$, which is needed for the
analysis of six degrees of separation discussed in a later section. Though many terms actually
vanish, $R^6$ still has a rather complex expression.
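For small graphs the content of $R^n$ can be obtained by enumerating strings directly instead of expanding the product. The sketch below is one way to do this in Python, under the reading of the text that all vertices of a string must be distinct except that the end point may coincide with the start point when $n \ge 3$ (a closed string). The name `r_matrix` and the nested-list adjacency format are assumptions made for illustration.

```python
def r_matrix(adj, n):
    """Brute-force R^n: R[i][f] counts n-edge strings from i to f whose vertices
    are all distinct, except that f == i is allowed for n >= 3 (closed string)."""
    size = len(adj)
    R = [[0] * size for _ in range(size)]

    def extend(start, current, visited, steps_left):
        if steps_left == 0:
            R[start][current] += 1
            return
        for nxt in range(size):
            if not adj[current][nxt]:
                continue
            # Returning to the start is only allowed on the very last step.
            closes = (nxt == start and steps_left == 1 and n >= 3)
            if nxt in visited and not closes:
                continue
            extend(start, nxt, visited | {nxt}, steps_left - 1)

    for i in range(size):
        extend(i, i, {i}, n)
    return R


# Complete graph on four vertices as a small example.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
for row in r_matrix(K4, 3):
    print(row)
```

The recursion depth equals $n$, which mirrors the nested loops mentioned above; the running time grows roughly like the number of strings, so this is a check for small $N$ only.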
We give the expressions of $R^2$ to $R^6$:

$[R^2]_{if} = [A^2]_{if} - [A^2]_{ii}\,\delta_{if} = [A^2]_{if} - G_{if},$

$[R^3]_{if} = [A^3]_{if} - \{G, A\}_{if} + a_{if},$

$[R^4]_{if} = [A^4]_{if} - \{G, A^2\}_{if} - \{A, \mathrm{diag}(A^3)\}_{if} + 2[A^2]_{if} + [G^2 - G - AGA]_{if} + 3a_{if}[A^2]_{if},$
[R%f = [A%f {A,tiag(A*)} if {G,A% f {A 2 , d iag (A 3 )} if + 3([A 2 ] lf f[A] lf 
+ 3[A% f [A] lf + 2{G 2 , A}if + [GAG]if - 6{G,A} lf - {AGA,A} if + 3[A% f 
+ {A,diag(AGA)}. f + 2[diag(A 3 G)] lf - [A ■ diag(A 3 ) ■ A] if - [diag(A 3 )} lf 

+ 3^a lk a kf ([A 2 ] k} + [A 2 ] lk - %[A 2 ] fe/ ) + 4a if (l - a if ), (4) 

k 

where suffixes are abbreviated in trivial cases and $\{\cdot, \cdot\}$ denotes the
anti-commutator, $\{A, B\} = AB + BA$. $\mathrm{diag}A$ indicates the diagonal matrix whose
elements are the diagonal elements of $A$, and $G$ is the diagonal matrix defined by

$$G = \begin{pmatrix} k_1 & & & \\ & k_2 & & \\ & & k_3 & \\ & & & \ddots \end{pmatrix}, \qquad (5)$$

where $k_i$ is the degree of vertex $i$.
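The closed forms for $R^2$, $R^3$ and $R^4$ can be checked against a direct enumeration of strings on small graphs. The sketch below does this in plain Python; the helper names (`matmul`, `r_brute`, `r_closed`, `from_edges`) are hypothetical, and the $\{A, \mathrm{diag}(A^3)\}$ term in $R^4$ is entered with a minus sign, which is the sign that the enumeration reproduces on graphs containing triangles.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def r_brute(adj, n):
    """Enumerate n-edge strings with distinct vertices (closing allowed for n >= 3)."""
    size = len(adj)
    R = [[0] * size for _ in range(size)]

    def extend(start, current, visited, steps_left):
        if steps_left == 0:
            R[start][current] += 1
            return
        for nxt in range(size):
            if not adj[current][nxt]:
                continue
            closes = (nxt == start and steps_left == 1 and n >= 3)
            if nxt in visited and not closes:
                continue
            extend(start, nxt, visited | {nxt}, steps_left - 1)

    for i in range(size):
        extend(i, i, {i}, n)
    return R


def r_closed(adj):
    """Closed forms of R^2, R^3, R^4 in terms of A and the degree matrix G."""
    N = len(adj)
    A = adj
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    A4 = matmul(A3, A)
    k = [sum(row) for row in A]  # diagonal of G
    AGA = [[sum(A[i][j] * k[j] * A[j][f] for j in range(N)) for f in range(N)]
           for i in range(N)]
    R2 = [[A2[i][f] - (k[i] if i == f else 0) for f in range(N)] for i in range(N)]
    R3 = [[A3[i][f] - (k[i] + k[f]) * A[i][f] + A[i][f] for f in range(N)]
          for i in range(N)]
    R4 = [[A4[i][f]
           - (k[i] + k[f]) * A2[i][f]             # {G, A^2}
           - A[i][f] * (A3[i][i] + A3[f][f])      # {A, diag(A^3)}
           + 2 * A2[i][f]
           + ((k[i] ** 2 - k[i]) if i == f else 0)  # G^2 - G
           - AGA[i][f]
           + 3 * A[i][f] * A2[i][f]
           for f in range(N)] for i in range(N)]
    return {2: R2, 3: R3, 4: R4}


def from_edges(n, edges):
    adj = [[0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    return adj


# A triangle, a 4-cycle sharing an edge with it, and a pendant vertex.
G6 = from_edges(6, [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 0), (4, 5)])
closed = r_closed(G6)
print(all(closed[n] == r_brute(G6, n) for n in (2, 3, 4)))
```

Such a comparison is a cheap safeguard when transcribing the longer expressions for $R^5$ and $R^6$ into a program.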

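$R^6$ is obtained after straightforward but long and tedious calculations. We divide it into
the following four parts to make the calculation tractable.

$$\begin{aligned}
[R^6]_{if} &= \sum_{j,k,l,m,n} a_{ij}a_{jk}a_{kl}a_{lm}a_{mn}a_{nf}\, \Delta_{ik}\Delta_{jl}\Delta_{km}\Delta_{ln}\Delta_{mf}\Delta_{il}\Delta_{jm}\Delta_{kn}\Delta_{lf}\Delta_{im}\Delta_{jn}\Delta_{kf}\Delta_{in}\Delta_{jf} \\
&= \sum_{j,k,l,m,n} a_{ij}a_{jk}a_{kl}a_{lm}a_{mn}a_{nf}\, \Delta_{ik}\Delta_{jl}\Delta_{km}\Delta_{ln}\Delta_{mf}\Delta_{il}\Delta_{jm}\Delta_{kn}\Delta_{lf}\Delta_{im}\Delta_{jn}\Delta_{kf} \\
&\quad - \sum_{k,l,m,n} a_{if}a_{fk}a_{kl}a_{lm}a_{mn}a_{nf}\, \Delta_{ik}\Delta_{fl}\Delta_{km}\Delta_{ln}\Delta_{mf}\Delta_{il}\Delta_{kn}\Delta_{im} \\
&\quad - \sum_{j,k,l,m} a_{ij}a_{jk}a_{kl}a_{lm}a_{mi}a_{if}\, \Delta_{ik}\Delta_{jl}\Delta_{km}\Delta_{li}\Delta_{mf}\Delta_{jm}\Delta_{lf}\Delta_{kf} \\
&\quad + \sum_{k,l,m} a_{if}a_{fk}a_{kl}a_{lm}a_{mi}\, \Delta_{ik}\Delta_{fl}\Delta_{km}\Delta_{li}\Delta_{mf} \\
&= R^6[1]_{if} + R^6[2]_{if} + R^6[3]_{if} + R^6[4]_{if},
\end{aligned} \qquad (6)$$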

where $\Delta_{ik} = 1 - \delta_{ik}$. Furthermore we divide $R^6[1]_{if}$ into the following
four parts to make the calculation tractable.

&ij&jk&kl&lTn&mn®nfAikAjiAk rn Ai n A Tn f AuAjmAknAifAimAjnAkf 

j,k,l,m,n 

= ]T ail a jk a kl a lm a mn a nf A lk A 3l A km A ln A mf A a A 3m A knAifA 

i n 

— 5^ a ij a jk^kl^li^in^nfAikAjiAi n AifAk n AifAj n Akf 

j,k,l,n 

J2 a t] a ]f a fl a lm a mn a nf A lf A 3l A fmAi n AnAj rn Aj n 

j,l,m,n 

+ ^2 a i3 a 3f a fl a li a in a nfAifAjiAi n Aj n A m f, 

=R 6 [1, l] if + R 6 [l, 2] if + R 6 [l, 3] if + R 6 [lA]if- (7) 
The four terms are respectively expressed as follows; 
R 6 [l, l] if =[A% f + [A% f (4 - (ki + k f )) + [AGA]if(ki + k p ) - {AG A, A 2 } if - [A 2 GA 2 ] 



if 



+ 2[A(G 2 - 3G)A] lf + 3j2^ajka kf [A 2 ] jk - Y} A % 3 - { a ij[A 2 \j P + i^k^f) 

j;k j 

+2j2[A 2 ] ij [A 2 } jf (a ij + a jf ) + [A% f (k 2 + k) - 3(k t + k f ) + 4) 

3 

-[A% } ([A% t + [A 3 ]ff) + ([A% f ) 2 + J2a ij a jf (([A% j + [A a ] fj ) 

3 

-[A%, 2{[A% 3 + [A 2 ] fj ) + [AGA]jj + {{[A% 3 ) 2 + ([A 2 ],,-) 2 )) 

+A if ([A% f {(ki - l)(k f 1) + 1 - [A% f ) ([A% f ) 2 + ^2[A 2 ]ij[A 2 ]jf(a,ij + a jf ) 

+ £ w((([^k) 2 + ^f3?) ([A% 3 + [A 2 ] fj )) j 



3 

+<hf ( 2fc f + h - 5) + [A%i(2ki + k f -5) + [A 2 ] if (11 - 3k t - 3k f ) 

-2j2a ij a jf (([A 2 } ij + [A 2 } fj ))y 
R 6 [l, 2} if +R 6 [l, 3} if = -A tf ([A 2 ] if ([A% + [A% } ) + A[AGA] lf {A 2 , G 2 3G} if {AG A, A 2 } lf 
-4[A 2 ] if -^ aij a jf (([A 2 ] if ) 2 + ([A 2 } if ) 2 ^ {[A 2 ] ij + [A 2 \ fj ) 

+a if (-2[A 2 ] if [A% f + 2[A 2 ] if (k i + k f - 3) + 2^o ij -a j/ ([A 2 ] ij + [A 2 ] n )^j , 

i? 6 [l,% =[A% f A if (([A% f ) 2 3[A% f + 2). (8) 
$R^6[2]_{if}$, $R^6[3]_{if}$ and $R^6[4]_{if}$ are respectively given by the following expressions;

i? 6 [% +J R 6 [% = Oi f ^2[A%f - (([A% t + [A 5 ] ff ) 7([A% t + [A% f ) + 22[A 2 ] tf 
+ 4[A 3 ]i / [A 2 ] i/ + 2([A 3 ]^ + [A 3 ]^/fc/) + ^[A 3 ] w ( aj/ + a l3 ) 

3 

-4j2 a ^f{[A 2 } tJ + [A 2 ]fj) - 6{A 2 ,G} tf - 2[AGA]if + {A, AGA} U + {A, AG A} } A , 

3 / 

R 6 [%f =Oi f ([A% f [AGA] if {A 2 ,G} + h[A\ f ([A% + [A 3 ] ff fj . (9) 

By unifying all the terms, we obtain the full expression of $R^6$. Lastly we give the
expressions of $\mathrm{Tr}\,R^n$ appearing in Eq. (7).

$\mathrm{Tr}(R^2) = 0,$

$\mathrm{Tr}(R^3) = \mathrm{Tr}(A^3),$

$\mathrm{Tr}(R^4) = \mathrm{Tr}(A^4) - 3\mathrm{Tr}(GA^2) + 2\mathrm{Tr}(A^2) + \mathrm{Tr}(G^2 - G),$

Tr(R 5 ) = Tr(A 5 ) - 3Tr(GA 3 ) + 6Tr(A 3 ) - diag(A 3 )Tr(A 2 ) + Ndiag(2A 3 G - A 3 ), 
Tr(R 6 ) = Tr(A 6 ) + 6Tr(A 4 ) - 5Tr(GA 4 ) - 4Tr(A 3 ) + Tr(A 2 G 2 ) - 6Tr(A 2 G) + 4Tr(A 2 ) 

+ 2Tr(AGAG) - ^(a lt ) 2 - Y^ A ^ ai A A \i + 6 Y a *A A %j + Y a^a jk a kl [A\ k . 

(10) 

By using $R^n$, $\bar{S}_p$ and the generalized p-th clustering coefficient $C_{(p)}$ are given by

S p = (11) 

TtR p 

where the denominator and the numerator indicate the contributions from open strings and from
a closed string, respectively. Thus the usual clustering coefficient $C_{(3)}$ becomes

$$C_{(3)} = \frac{\mathrm{Tr}R^3}{\|R^2\|} = \frac{\mathrm{Tr}A^3}{\|A^2\| - \mathrm{Tr}A^2},$$

where we introduced a new symbol $\|\cdots\|$, which denotes $\|A\| = \sum_{i,j} A_{ij}$.
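As an illustration of how a closed-string trace and an open-string sum combine into a clustering coefficient, the sketch below evaluates the ratio $\mathrm{Tr}R^p / (\|R^{p-1}\| - \mathrm{Tr}R^{p-1})$ by brute force. Taking the denominator to be the off-diagonal (open string) part of $R^{p-1}$ is an assumption made here; for $p = 3$ it reduces to $\mathrm{Tr}A^3 / (\|A^2\| - \mathrm{Tr}A^2)$. The function names are hypothetical.

```python
def r_brute(adj, n):
    """Enumerate n-edge strings with distinct vertices (closing allowed for n >= 3)."""
    size = len(adj)
    R = [[0] * size for _ in range(size)]

    def extend(start, current, visited, steps_left):
        if steps_left == 0:
            R[start][current] += 1
            return
        for nxt in range(size):
            if not adj[current][nxt]:
                continue
            closes = (nxt == start and steps_left == 1 and n >= 3)
            if nxt in visited and not closes:
                continue
            extend(start, nxt, visited | {nxt}, steps_left - 1)

    for i in range(size):
        extend(i, i, {i}, n)
    return R


def generalized_clustering(adj, p):
    """Closed p-strings divided by open (p-1)-edge strings (assumed normalisation)."""
    size = len(adj)
    Rp = r_brute(adj, p)
    Rprev = r_brute(adj, p - 1)
    closed = sum(Rp[i][i] for i in range(size))
    open_strings = sum(Rprev[i][f] for i in range(size) for f in range(size) if i != f)
    return closed / open_strings if open_strings else 0.0


K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(generalized_clustering(K4, 3), generalized_clustering(path3, 3))
```

With this normalisation a complete graph gives the value 1 and a tree gives 0, which is the behaviour expected of the usual global coefficient for $p = 3$.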

3 Application to Six Degrees of Separation 

We analyze general q-th degrees of separation based on the formalism developed in section 2.
Aoyama has proposed a condition, the so-called Milgram condition, for q-th degrees of
separation [2]:

$$M_{q+1} \equiv \frac{\bar{S}_{q+1}}{N} \sim O(N). \qquad (14)$$

For six degrees of separation, we obtain from Eq. (6)

$$\bar{S}_7 = \sum_{i,f} [R^6]_{if} / 2. \qquad (15)$$
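To make the quantity entering the Milgram condition concrete, the sketch below evaluates $M_q / N$ on a small graph, assuming that the number of non-degenerate q-strings is $\sum_{i,f}[R^{q-1}]_{if}/2$ for every $q$, following the pattern of Eq. (15), and that $M_q$ is that number divided by $N$. Both the general-$q$ form and the function names are assumptions for illustration, and the enumeration is only feasible for small $N$.

```python
def r_brute(adj, n):
    """Enumerate n-edge strings with distinct vertices (closing allowed for n >= 3)."""
    size = len(adj)
    R = [[0] * size for _ in range(size)]

    def extend(start, current, visited, steps_left):
        if steps_left == 0:
            R[start][current] += 1
            return
        for nxt in range(size):
            if not adj[current][nxt]:
                continue
            closes = (nxt == start and steps_left == 1 and n >= 3)
            if nxt in visited and not closes:
                continue
            extend(start, nxt, visited | {nxt}, steps_left - 1)

    for i in range(size):
        extend(i, i, {i}, n)
    return R


def string_count(adj, q):
    """Number of non-degenerate q-strings, assumed to be sum(R^(q-1)) / 2."""
    R = r_brute(adj, q - 1)
    return sum(sum(row) for row in R) / 2


def milgram_ratio(adj, q):
    """M_q / N with M_q = (number of non-degenerate q-strings) / N."""
    n_vertices = len(adj)
    return string_count(adj, q) / n_vertices / n_vertices


K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
for q in (2, 3, 4):
    print(q, string_count(K4, q), milgram_ratio(K4, q))
```

In this reading the separation number is the smallest $q$ for which `milgram_ratio(adj, q + 1)` reaches order one; for $q = 2$ the string count reduces to the number of edges.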

We investigate q-th degrees of separation by using Eqs. (4)-(10) and the Milgram condition. Here
we focus on scale free networks, where the degree distribution is $P(k) \sim k^{-\gamma}$. The
networks can be constructed based on the configuration model [22], [23], [24], which can
systematically produce networks with an arbitrary degree distribution. The networks produced by
the model are, however, generally degenerate multigraphs. We modify it a little to produce
networks without multiple edges. Since this is not essential in this article, we omit the
technical details. Although Eq. (3) reduces to Eqs. (4)-(9), we cannot evaluate the Milgram
condition in large scale networks because of the considerable computational complexity.
Although the estimations are carried out in small networks, we find that the results are stable
and reliable.
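Since the modification used to avoid multiple edges is not described, the following Python sketch only shows one common way to obtain a simple graph from the configuration model: stubs are matched at random and the whole matching is redrawn whenever a self-loop or a multiple edge appears. This rejection scheme, the function name and its parameters are assumptions for illustration and are not claimed to be the construction used for the results in this article.

```python
import random


def simple_configuration_graph(degrees, seed=0, max_tries=1000):
    """Random stub matching, restarted until the result has no self-loops
    and no multiple edges. Returns a set of edges (a, b) with a < b."""
    if sum(degrees) % 2:
        raise ValueError("the degree sum must be even")
    rng = random.Random(seed)
    # One stub per unit of degree: vertex v appears degrees[v] times.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    for _ in range(max_tries):
        rng.shuffle(stubs)
        edges = set()
        ok = True
        for a, b in zip(stubs[0::2], stubs[1::2]):
            edge = (min(a, b), max(a, b))
            if a == b or edge in edges:
                ok = False  # self-loop or multiple edge: reject this matching
                break
            edges.add(edge)
        if ok:
            return edges
    raise RuntimeError("no simple graph found; try another seed or sequence")


degrees = [3, 2, 2, 2, 2, 1]
print(sorted(simple_configuration_graph(degrees, seed=1)))
```

For a scale free network the degree sequence would be drawn from $P(k) \sim k^{-\gamma}$ before the matching; restarting becomes slow for heavy-tailed sequences, which is one reason such sketches are practical only for small $N$.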

Fig. 2 shows the relation between $\log_{10}(M_q/N)$ and $q$ for several values of $\gamma$,
where the average degree $\langle k \rangle$ is four and the network size is $N = 200$.
$\log_{10}(M_q/N)$ increases linearly with $q$ for every $\gamma$. The interior of the
rectangle in Fig. 2 shows the region where the Milgram condition is satisfied. From Fig. 2, we
see four degrees of separation in networks with $\gamma < 2.5$, while we cannot recognize that
vertices are linked together up to six degrees of separation in networks with $\gamma > 3.5$.
$\gamma = 2.75$ shows five degrees of separation, and $\gamma = 3.0$, the value of the exponent
found in many real-world networks, just shows six degrees of separation.

Figure 2: Separation number $q$ vs. $M_q$ for scale free networks with several values of $\gamma$

Table 1: Comparison of our $q$ and Aoyama's $q$ for diverse $\gamma$



Comparing these results with Aoyama's [2], for which we quote the median of the region, there
is a little difference between the two results, as shown in Table 1. In particular, Aoyama's
assertion that $\gamma = 2$ is a critical point for two degrees of separation seems to conflict
with our result. But Aoyama gives only a region in which the separation number lies for every
$\gamma$, and so we take the medians of the regions in Table 1. Considering moreover that
Aoyama's calculations are based on a tree approximation, so that the separation number $q$ is
only an estimate, the two results are not necessarily inconsistent. Furthermore the estimations
depend on how the networks are built, even for networks with the same $\gamma$.

The fact that our result comes closer to Aoyama's [2] for smaller $N$ (we do not go into the
details) is consistent with Aoyama's assertion [2] that the accuracy of his calculations
decreases for larger $\gamma$.

Since the network size $N$ is small, we can demonstrate the validity of our results by directly
evaluating, from the powers of the adjacency matrix, the ratio $r$ of mutually connected
vertices to all vertices. Fig. 3 shows the relation between $q$ and $r$ for every $\gamma$. When
every node connects with 50% to 60% of the vertices in a network, it may be claimed in general
that the network is almost connected. Taking $r >$ 50% to 60% as a borderline, the $q$ values
derived from it are consistent with those estimated from $M_q$ in our calculation. Thus the
point where $M_q/N$ becomes $O(1)$ really shows that a majority of the vertices in a network are
connected to each other.
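A direct evaluation of such a ratio can be sketched as follows. Here $r$ is taken to be the fraction of ordered vertex pairs that are linked by a walk of at most $q$ edges, which is read off from the nonzero entries of $A, A^2, \ldots, A^q$; this precise definition of $r$ and the function name `reach_ratio` are assumptions for illustration.

```python
def reach_ratio(adj, q):
    """Fraction of ordered pairs (i, j), i != j, joined by a walk of at most q edges."""
    n = len(adj)
    # reach[i][j] is True when j can be reached from i within the steps taken so far.
    reach = [[i == j for j in range(n)] for i in range(n)]
    for _ in range(q):
        reach = [[reach[i][j] or any(reach[i][k] and adj[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    linked = sum(1 for i in range(n) for j in range(n) if i != j and reach[i][j])
    return linked / (n * (n - 1))


# Path graph 0 - 1 - 2 - 3 as a small example.
path4 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
for q in (1, 2, 3):
    print(q, reach_ratio(path4, q))
```

Using Boolean entries instead of the integer powers of $A$ keeps the numbers small while preserving the zero pattern, which is all that is needed for $r$.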

4 Milgram Condition and Generalized Clustering Coefficient

We explore the relation between the Milgram condition and the generalized clustering
coefficients in this section. By doing so, we can analyze how circle structures in a network
are related to the separation number $q$. We define the following two quantities:

$$X = \sum_{p=3}^{q} C_{(p)}, \qquad Y = \log_{10} M_q. \qquad (16)$$

Figure 3: Separation number $q$ vs. $r$ for scale free networks

Figure 4: The sum of $C_{(p)}$ and $M_q$ for the scale free network with $\gamma = 3.0$



Fig. 4 shows the relation between $X$ and $Y$ at $\gamma = 3.0$ in the scale free network with
$N = 200$. We can recognize that $Y$ increases linearly with $X$:

$$Y = AX + B. \qquad (17)$$

Such a relation holds in common for $1.8 < \gamma < 4.0$. That is to say, it becomes clear that
there is an exponential relation between $M_q/N$ and the sum of the generalized clustering
coefficients:

$$M_q \sim \exp\Big(c \sum_{p=3}^{q} C_{(p)}\Big), \qquad (18)$$

where $c$ is a constant determined by $A$ and $B$. Thus the separation number $q$ depends
greatly on the sum of $C_{(p)}$ ($p \le q$), which represents the state of the circle structures
up to size $q$ in a network. This indicates that the generalized clustering coefficient
introduced in this article is an effective index to explore q-th degrees of separation.
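The step from the linear relation to the exponential one can be spelled out as follows; this is only a short derivation under the assumption that $Y$ is the base-10 logarithm defined in Eq. (16).

```latex
\begin{align*}
\log_{10} M_q &= A X + B, \qquad X = \sum_{p=3}^{q} C_{(p)}, \\
M_q &= 10^{B}\, 10^{A X} = 10^{B} \exp\!\big( (A \ln 10)\, X \big).
\end{align*}
```

In this form the exponent constant is $c = A \ln 10$, while $B$ enters only through the overall prefactor $10^{B}$.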
We observe that further relations hold for $q$ and $M_q/N$ by drawing a superposed diagram of
the above-mentioned linear relation for diverse values of $\gamma$. Fig. 5 is the superposed
diagram for $2.0 < \gamma < 4.0$. The straight lines for $2.0 < \gamma < 4.0$ almost join into
a single line with an almost common gradient. This means that $q$ depends only on the
generalized clustering coefficient and not directly on $\gamma$. Thus the exponent of scale
free networks is not crucial for the separation number, but the state of circle structures in
networks is essential.

The reason why this relation holds is an outstanding issue; at present it is only a
phenomenological relation.



5 Summary 

In this article, we first introduced the generalized clustering coefficient, which carries
information on the state of circle structures in a network, based on the string formulation
proposed in [2] to analyze networks. Fusing the adjacency matrix $A$ into the formalism, we
reformulated the string formalism to define the generalized q-th clustering coefficient in a
compact way [20], [21]. Then we introduced the $R$ matrix into the formalism in place of $A$.
The powers of $R$ play a central role in the analysis of this article. The explicit
representations of $R^n$ for $n = 2$ to $6$ are given after straightforward but tedious
calculations.

Next we applied the formulation to the subject of q-th, especially $q = 6$, degrees of
separation. We evaluated whether the Milgram condition proposed in Aoyama's article holds or
not for diverse exponents in scale free networks. We find that the larger the exponent $\gamma$
is, the more difficult it is for the Milgram condition to hold. Six degrees of separation is
found just at $\gamma = 3$, a value that is fairly universally observed in real-world networks.

We also find that the result seems to be a little different from Aoyama's [2]. We think that
this does not necessarily mean inconsistency, considering that Aoyama's evaluation is based on
the tree approximation and furthermore that the way of constructing networks may be different
(Aoyama does not explain the way of constructing networks, and the construction of networks in
this article includes an original way of avoiding multiedges). Our results are also supported
by analyzing the number of zero components in $A^n$.

Our construction is based on the configuration model [22], [23], [24] with average degree
$\langle k \rangle = 4$. According to some sociologists, the estimated average number of
acquaintances of a person is 290 [26], [27], [28]. Considering this estimate, the separation
number would in reality take smaller values for every exponent.

The following problems are left for the future:

1. Finding explicit expressions of $R^n$ for arbitrary $n$ by applying our formalism, and then
finding a general formula for $R^n$.

2. Revealing the relations between q-th degrees of separation and $N$, $\langle k \rangle$ or
$\langle k^n \rangle$; more definitely, discovering the relations between $q$ and $N$,
$\langle k \rangle$ or $\langle k^n \rangle$.

3. The reason why the relation (18) holds is an outstanding issue. We need to find theoretical
reasons for the phenomenological relations between the separation number and various circle
structures, especially $C_{(q)}$.

4. Examining whether this relation holds or not in other networks, especially small world
networks, in which at least the usual clustering coefficient can be controlled by construction.